    An efficient gene selection method for high-dimensional microarray data based on sparse logistic regression

    Gene selection in high-dimensional microarray data has become increasingly important in cancer classification. The high dimensionality of microarray data makes the application of many expert classifier systems difficult. To simultaneously perform gene selection and estimate the gene coefficients in the model, sparse logistic regression using the L1-norm has been successfully applied to high-dimensional microarray data. However, when there is high correlation among genes, the L1-norm does not perform effectively. To address this issue, an efficient sparse logistic regression (ESLR) is proposed. Extensive applications using high-dimensional gene expression data show that the proposed method can successfully select the highly correlated genes. Furthermore, ESLR is compared with three other methods and exhibits competitive performance in both classification accuracy and Youden's index. Thus, we conclude that ESLR makes a significant contribution to sparse logistic regression methods and can be used for cancer classification with high-dimensional microarray data.
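
    A minimal sketch of the L1-penalised (lasso) logistic regression baseline that the abstract builds on, run with scikit-learn on simulated "microarray-like" data. The data dimensions and penalty strength C are illustrative assumptions, and this is the standard L1 baseline, not the proposed ESLR method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Simulate 100 samples with 2000 "genes", only a few of which are informative.
X, y = make_classification(n_samples=100, n_features=2000, n_informative=10,
                           n_redundant=20, random_state=0)

# The L1 penalty drives most coefficients to exactly zero, so gene selection and
# coefficient estimation happen in a single fit.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X, y)

selected = np.flatnonzero(model.coef_.ravel())
print(f"{selected.size} genes selected out of {X.shape[1]}")
```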

    Diagnostic in Poisson Regression Models

    The Poisson regression model is one of the most frequently used statistical methods and serves as a standard tool for data analysis in many fields. Our focus in this paper is on the identification of outliers; we mainly discuss the deviance and Pearson statistics as diagnostics for this purpose. Simulation and real data are presented to assess the performance of the diagnostic statistics.
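
    A minimal sketch of outlier screening in a Poisson regression using deviance and Pearson residuals, with statsmodels on simulated data. The simulated contamination and the |residual| > 2 cut-off are illustrative assumptions, not values from the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)
y = rng.poisson(mu)
y[:3] += 15                      # contaminate a few observations

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

dev_res = fit.resid_deviance     # deviance residuals
pea_res = fit.resid_pearson      # Pearson residuals

# Flag observations whose Pearson residual is unusually large in absolute value.
flagged = np.flatnonzero(np.abs(pea_res) > 2)
print("Flagged observations:", flagged)
```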

    Almost unbiased ridge estimator in the zero-inflated Poisson regression model

    The zero-inflated Poisson (ZIP) regression model is a very popular model for count data that have extra zeros. In some situations the count data are correlated, and so multicollinearity exists among the explanatory variables. The traditional maximum likelihood estimator (MLE) is then no longer a reliable estimator because its mean squared error (MSE) becomes inflated. The ridge estimator (RE) is used to overcome this problem. In this work, an almost unbiased ridge estimator for the ZIP model (AUZIPRE) is proposed to tackle the multicollinearity problem in count data. We investigate the behavior of the proposed estimator using a simulation study. Using the MSE measure, the results of the proposed estimator are compared with those of the RE and the MLE. Furthermore, we apply the proposed estimator to a real dataset. The results show that the AUZIPRE outperforms the RE and the MLE in the presence of multicollinearity among the count data in the ZIP model.
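
    A minimal sketch, on simulated data, of fitting a ZIP model by maximum likelihood with statsmodels and checking the condition number of the regressor matrix, the multicollinearity symptom that motivates ridge-type estimators. The data-generating values are illustrative assumptions, and the proposed AUZIPRE itself is not implemented here.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)             # nearly collinear with x1
X = sm.add_constant(np.column_stack([x1, x2]))

mu = np.exp(0.3 + 0.5 * x1 + 0.5 * x2)
y = rng.poisson(mu)
y[rng.random(n) < 0.3] = 0                          # inject extra zeros

zip_fit = ZeroInflatedPoisson(y, X).fit(disp=0)     # MLE, intercept-only inflation part
print(zip_fit.params)                               # inflation and count coefficients

# The condition number grows as x1 and x2 become more collinear, which inflates
# the MSE of the MLE and motivates the ridge-type alternatives.
print("condition number of X:", np.linalg.cond(X))
```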

    Some almost unbiased ridge regression estimators for the zero-inflated negative binomial regression model

    Zero-inflated negative binomial (ZINB) regression models are commonly used for count data that show overdispersion and extra zeros. Correlation among the variables of the count data leads to a multicollinearity problem. In this case, the maximum likelihood estimator (MLE) is not an efficient estimator, as its mean squared error (MSE) becomes large. Several alternative estimators, such as ridge estimators, have been proposed to solve the multicollinearity problem. In this paper, we propose an estimator called the almost unbiased ridge estimator for the ZINB model (AUZINBRE) to solve the multicollinearity problem in correlated count data. The performance of the AUZINBRE is investigated using a Monte Carlo simulation study. The MSE is used as a measure to compare the results of the proposed estimator with those of the ridge estimators and the MLE. In addition, the AUZINBRE is applied to a real dataset.
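
    A minimal numpy sketch of the ridge and one common form of the "almost unbiased" ridge adjustment in their familiar linear-regression setting, shown only as an analogy: the abstract's AUZINBRE applies an analogous correction within the ZINB likelihood, which is not reproduced here. The biasing constant k, the data sizes, and the linear-model setting are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100, 4, 0.5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + rng.normal(scale=0.01, size=n)   # near-perfect collinearity
beta = np.array([1.0, 1.0, 0.5, -0.5])
y = X @ beta + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)               # ordinary least squares (unstable here)
A = np.linalg.inv(S + k * np.eye(p))
beta_ridge = A @ S @ beta_ols                        # ridge estimator
beta_aur = (np.eye(p) - k**2 * A @ A) @ beta_ols     # almost unbiased ridge adjustment

print(np.c_[beta_ols, beta_ridge, beta_aur])         # columns: OLS, ridge, almost unbiased ridge
```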

    A QSAR classification model of skin sensitization potential based on improving binary crow search algorithm

    Classifying skin sensitization using a quantitative structure-activity relationship (QSAR) model is important. Applying descriptor selection is essential to improve the performance of the classification task. Recently, a binary crow search algorithm (BCSA) was proposed and has been successfully applied to variable selection. In this work, a new time-varying transfer function is proposed to improve the exploration and exploitation capability of the BCSA in selecting the most relevant descriptors for a QSAR classification model with high classification accuracy and short computing time. The results demonstrate that the proposed method is reliable and can reasonably separate the compounds into sensitizers and non-sensitizers with high classification accuracy.
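
    A minimal sketch of how a time-varying transfer function can map a crow's continuous position onto a binary descriptor-selection vector. The sigmoid form and the linear decay of the steepness parameter are illustrative assumptions; the paper's exact transfer function is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

def time_varying_sigmoid(v, t, t_max, tau_start=4.0, tau_end=0.5):
    """Sigmoid transfer whose steepness parameter tau decays linearly with iteration t."""
    tau = tau_start - (tau_start - tau_end) * t / t_max
    return 1.0 / (1.0 + np.exp(-v / tau))

def to_binary(position, t, t_max):
    """Turn a continuous position into a 0/1 descriptor-inclusion mask.

    Early on (large tau) the probabilities stay near 0.5, encouraging exploration;
    later (small tau) they lock onto the sign of the position, encouraging exploitation.
    """
    prob = time_varying_sigmoid(position, t, t_max)
    return (rng.random(position.shape) < prob).astype(int)

position = rng.normal(size=10)          # continuous position of one crow
for t in (0, 50, 100):
    print(t, to_binary(position, t, t_max=100))
```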

    Applying Penalized Binary Logistic Regression with Correlation Based Elastic Net for Variables Selection

    Reducing the dimensionality of high-dimensional classification problems with penalized logistic regression is one of the challenges in applying binary logistic regression. The penalized method applied here, the correlation-based elastic penalty (CBEP), was used to overcome the limitations of the LASSO and the elastic net in variable selection when there is perfect correlation among explanatory variables. The performance of the CBEP was demonstrated through its application to two well-known high-dimensional binary classification data sets. The CBEP provided superior classification performance and variable selection compared with other existing penalized methods. It is a reliable penalized method for binary logistic regression.
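
    A minimal sketch of the standard elastic-net-penalised logistic regression that CBEP modifies, using scikit-learn on simulated correlated predictors. The l1_ratio and C values are illustrative assumptions, and the correlation-based reweighting of the penalty described in the abstract is not implemented here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# High-dimensional binary classification data with redundant (correlated) features.
X, y = make_classification(n_samples=120, n_features=500, n_informative=15,
                           n_redundant=30, random_state=0)

# saga is the scikit-learn solver that supports the elastic-net penalty.
enet = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                          C=0.5, max_iter=5000)
enet.fit(X, y)

print("non-zero coefficients:", np.count_nonzero(enet.coef_))
```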

    Restricted ridge estimator in the Inverse Gaussian regression model

    The inverse Gaussian regression (IGR) model is a well-known model in applications where the response variable is positively skewed. Its parameters are usually estimated using the maximum likelihood (ML) method. However, the ML method is very sensitive to multicollinearity. A ridge estimator has previously been proposed for the inverse Gaussian regression model. Here, a restricted ridge estimator is proposed. Simulation and real data example results demonstrate that the proposed estimator outperforms the ML estimator and the inverse Gaussian ridge estimator.
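
    A minimal sketch, on simulated data, of fitting the IGR model by maximum likelihood with statsmodels and then applying a generic ridge-type shrinkage to the ML coefficients. The log link, the form beta(k) = (S + kI)^(-1) S beta_ML with S taken from the estimated information matrix, and the value of k are illustrative assumptions; the linear restrictions that define the paper's restricted ridge estimator are not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)            # highly collinear predictors
X = sm.add_constant(np.column_stack([x1, x2]))
mu = np.exp(0.2 + 0.4 * x1 + 0.4 * x2)
y = rng.wald(mu, 5.0)                              # inverse Gaussian responses

# Maximum likelihood fit of the inverse Gaussian GLM with a log link.
fit = sm.GLM(y, X, family=sm.families.InverseGaussian(sm.families.links.Log())).fit()

# Generic ridge-type shrinkage of the ML coefficients.
k = 0.1
S = np.linalg.inv(fit.cov_params())                # proportional to the estimated information
beta_ridge = np.linalg.solve(S + k * np.eye(S.shape[0]), S @ fit.params)
print(np.c_[fit.params, beta_ridge])               # columns: ML estimate, ridge-shrunk estimate
```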

    Non-transformed principal component technique on weekly construction stock market price

    Fast-growing urbanization has contributed to the construction sector becoming one of the major sectors traded in the world stock market. In general, non-stationarity is characteristic of most stock market price patterns. Even though a stationarity transformation is a common approach, it may lead to a loss of the originality of the data. Hence, a non-transformation technique using generalized dynamic principal components (GDPC) was considered for this study. GDPC was compared with two transformed principal component techniques, so that both approaches could be viewed from a larger perspective. The latest weekly two-year observations of nine construction stock market prices from seven different countries were used, and the data were tested for stationarity before performing the analysis. As a result, the non-transformed technique gave the lowest mean squared error for eight of the nine series. Similarly, eight construction stock market prices had the highest percentage of explained variance. In conclusion, a non-transformed technique can also produce a better outcome without the stationarity transformation.
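
    A minimal sketch of the comparison idea: reconstruct weekly price series with a principal-component model fitted to the raw (non-transformed) levels versus the first-differenced (stationarity-transformed) series, comparing reconstruction MSE and explained variance. Ordinary static PCA from scikit-learn stands in for the generalized dynamic principal components used in the paper, and the simulated random-walk "prices" are an illustrative assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
weeks, series = 104, 9                                  # two years of weekly data, nine markets
prices = np.cumsum(rng.normal(size=(weeks, series)), axis=0) + 100.0

def pca_mse(data, n_components=2):
    """Reconstruction MSE and total explained variance of a static PCA fit."""
    pca = PCA(n_components=n_components).fit(data)
    recon = pca.inverse_transform(pca.transform(data))
    return np.mean((data - recon) ** 2), pca.explained_variance_ratio_.sum()

mse_raw, evr_raw = pca_mse(prices)                      # non-transformed levels
mse_diff, evr_diff = pca_mse(np.diff(prices, axis=0))   # differenced (transformed) series
print(f"raw levels:  MSE={mse_raw:.3f}, explained variance={evr_raw:.2%}")
print(f"differenced: MSE={mse_diff:.3f}, explained variance={evr_diff:.2%}")
```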

    Tuning parameter selectors for bridge penalty based on particle swarm optimization method

    The bridge penalty is widely used for selecting and shrinking predictors in regression models. However, its effectiveness is sensitive to the choice of the shrinkage and tuning parameters. In this work, the shrinkage and tuning parameters of the bridge penalty are chosen simultaneously, and a continuous optimization method, particle swarm optimization, is proposed as a means to do this. The proposed method facilitates regression modeling with superior prediction performance. The results show that the proposed method is effective in comparison with other well-known methods, although its advantage varies with the simulation setup and the real data application.
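
    A minimal sketch of choosing the bridge-penalty parameters (lambda, gamma) with a hand-rolled particle swarm optimizer that minimises validation MSE on simulated data. A simple local-quadratic-approximation (iteratively reweighted ridge) solver stands in for whatever bridge solver the paper uses, and all PSO constants, search bounds, and data sizes are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 150, 20
X = rng.normal(size=(n, p))
beta_true = np.r_[np.array([3.0, -2.0, 1.5]), np.zeros(p - 3)]
y = X @ beta_true + rng.normal(size=n)
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

def bridge_fit(X, y, lam, gam, n_iter=50, eps=1e-6):
    """Bridge-penalised least squares via iteratively reweighted ridge (LQA)."""
    beta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    for _ in range(n_iter):
        D = np.diag(gam * (np.abs(beta) + eps) ** (gam - 2))
        beta = np.linalg.solve(X.T @ X + lam * D, X.T @ y)
    return beta

def objective(params):
    """Validation MSE of the bridge fit for one (log10 lambda, gamma) particle."""
    lam, gam = 10.0 ** params[0], params[1]
    beta = bridge_fit(X_tr, y_tr, lam, gam)
    resid = y_va - X_va @ beta
    return np.mean(resid ** 2)

# Plain global-best PSO over (log10 lambda, gamma).
n_particles, n_iters = 20, 40
lo, hi = np.array([-3.0, 0.1]), np.array([2.0, 2.0])   # search bounds
pos = rng.uniform(lo, hi, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([objective(q) for q in pos])
gbest = pbest[np.argmin(pbest_val)].copy()

for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 2)), rng.random((n_particles, 2))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([objective(q) for q in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()

print(f"selected lambda={10 ** gbest[0]:.4f}, gamma={gbest[1]:.2f}")
```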